1 Introduction


A common approach to model-based object recognition is the hypothesis and verification
paradigm. The pose in the image of a known object is hypothesized, then evaluated and
verified, to localize the object in the image. The hypothesis stage often consists of deriving a
set of geometric features from both the modeled object and the image data and determining
correspondences between pairs of image and model features. From these correspondences a
rigid transformation on the model features (and the model itself) is computed, aligning the
model geometrically with a hypothesized instance of the object in the image. This forms
the object pose hypothesis, which can then be evaluated and verified by comparing
the transformed model with the underlying image data.

The central problem in this method of hypothesis construction is determining the sets
of corresponding image and model features. This feature matching is difficult in the
domain of object localization for three main reasons. Due to object occlusions in the scene,
some model features may have no corresponding image feature; some features are missing.
Because there are other objects in the image, some image features will not correspond to
any model features; some features are spurious. Finally, due to inaccuracies in sensing
and feature extraction, there is some uncertainty in the geometry of image features; their
measured positions or orientations may deviate from the correct values. These three
factors (missing, spurious, and distorted image features) conspire to make determining feature
correspondences difficult.

In this paper we focus on the central problem of image and model feature matching
in the presence of missing, spurious, and distorted features, for the construction of pose
hypotheses. In particular we define a model of the geometrical uncertainty of image features,
and devise a tractable algorithm for determining all geometrically consistent sets of feature
correspondences given the uncertainty tolerances. The paper is organized into six main
sections. Section 2 outlines the formal basis of the approach. Section 3 gives details of
the computation and presents an algorithm for the construction of feature correspondences
and pose hypotheses. The final sections present some experimental results, extensions to
this work, related work, and the conclusions of the paper.

2  The  Idea:  Approximate  Matching 

This  paper  considers  the  case  of  localization  of  2D  objects  given  2D  sensory  data  such  as 
grey-level  images.  The  problem  at  hand  is  to  construct  sets  of  feature  correspondences  from 
which  to  form  pose  hypotheses.  We  first  define  what  constitutes  a  feature:  the  primary 
attributes  of  the  features  we  use  for  pose  hypothesis  are  a  definite  position  and  orientation 
in  the  plane.  For  the  first  part  of  this  development  we  make  use  of  point  features,  whose 
position  characterizes  the  position  of  a  point  on  the  boundary  contour  of  an  object,  and 
whose  orientation  may,  for  example,  characterize  the  orientation  of  the  contour  normal  at 
the  point,  if  this  is  stable.  We  later  extend  the  method  to  use  line  segments. 

Figure 1: Shown are uncertainty bounds in position and orientation. Any position within a
distance $\epsilon$ of the measured position is possible, as is any orientation within $\delta$ of the measured
orientation.

2.1  The  Basic  Elements 

This  section  presents  the  basic  ideas  involved  in  our  approach  to  feature  matching. 

2.1.1  Bounded  Uncertainty 

A key assumption we make is that the uncertainty in the actual position and orientation
of an image feature is bounded. This is a common assumption[8][4][3][7]. We consider two
independent bounds on the positional and orientational uncertainty. Figure 1 illustrates
these uncertainty bounds. The position of an image feature represents the measured
position of the contour point. We assume the true position may deviate from the measured
position by a maximum distance of $\epsilon$, thus the real position falls within a circle of radius
$\epsilon$ centered at the measured position. We assume the true orientation may deviate from
the measured orientation by a maximum angle of $\delta$, thus the real orientation falls within a
range of orientations of length $2\delta$ centered at the measured orientation.
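
The bounded-uncertainty assumption can be sketched as a simple predicate. This is a minimal illustration, not the authors' code: the function names `angle_diff` and `within_bounds` are hypothetical, positions are assumed to be Python complex numbers, and orientations are assumed to be angles in radians.

```python
import math


def angle_diff(a, b):
    """Signed difference a - b, wrapped into [-pi, pi]."""
    return math.atan2(math.sin(a - b), math.cos(a - b))


def within_bounds(p_meas, theta_meas, p_true, theta_true, eps, delta):
    """True if a candidate true feature is compatible with the measurement.

    The position must lie in the circle of radius eps around the measured
    position, and the orientation within delta of the measured orientation.
    """
    position_ok = abs(p_true - p_meas) <= eps
    orientation_ok = abs(angle_diff(theta_true, theta_meas)) <= delta
    return position_ok and orientation_ok
```

Wrapping the angle difference keeps the orientation test meaningful for angles on either side of $\pm\pi$.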

2.1.2  Some  Notation 

We characterize a transformation by three parameters $\phi$, $u$, and $v$, where $\phi$ is the angle
of a rotation about the origin and $u$ and $v$ are translations in the $x$ and $y$ directions,
respectively. In the 2D domain all positions and orientations can be represented by vectors
$\vec{v} \in \Re^2$. Note that the vector space $\mathbb{C}$ of complex numbers is isomorphic to the vector
space $\Re^2$. For much of the analysis it will be convenient to use complex numbers to
represent positions and orientations. With this representation, rotation about the origin by
an angle $\phi$ corresponds to multiplication by the complex exponential $e^{i\phi}$. If $\vec{a} = (a_x, a_y)^T$,
as a complex number we have $\vec{a} \leftrightarrow a = a_x + i a_y$. Let the positions of each model and image
feature be represented by $p_m$ and $p_d$, respectively. Similarly, the orientations of the features
are given by $\theta_m$ and $\theta_d$. We will denote a feature match as an ordered pair $(m, d)$, where
$m = (p_m, \theta_m)$ and $d = (p_d, \theta_d)$. Finally, let $T$ be the group of translations, $R$ be the
group of rotations about the origin, and the transformation group be $TPS = T \otimes R$, their
composition.1 We will let $T_\phi \in R$ be the operator for rotation by $\phi$, and $T_t \in T$ be the
operator for translation by $t = u + iv$. A transformation $T = T_t \circ T_\phi$ is the composition of
a rotation and a translation, applied in that order.

Figure 2: Model features on the left, image features on the right. A geometrically consistent
match set is shown in figure 3. The positions are indicated by a small dot, and the orientations by
a short line segment.


2.2  Geometrically  Consistent  Match-Sets 

We will use the term feature match or match to describe a single pair of a model
feature and an image feature. An important fundamental definition is that of a feasible
transformation for a particular feature match, which is closely linked to the notion of a
geometrically consistent set of feature matches which we will introduce in this section. Let
$T$ be an arbitrary transformation, and let $(p'_m, \theta'_m) = T[(p_m, \theta_m)]$ represent a transformed
model feature. For most $T \in TPS$ we have $|p'_m - p_d| > \epsilon$, or $|\theta'_m - \theta_d| > \delta$. That is,
after transformation the model and image features will not be even approximately aligned
geometrically. But for some transformations the two matched features will be approximately
aligned geometrically. Let $F_{m,d} \subset TPS$ be the set

$$F_{m,d} = \{ T \in TPS : |p'_m - p_d| \le \epsilon,\ |\theta'_m - \theta_d| \le \delta,\ (p'_m, \theta'_m) = T[(p_m, \theta_m)] \},$$

that is, $F_{m,d}$ is the set of transformations on the model feature $m$ which leave it within
$\epsilon$ in position, and $\delta$ in orientation, of the data feature $d$. We define the set of feasible
transformations for the feature match $(m, d)$ to be the set $F_{m,d}$.
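
Membership in $F_{m,d}$ can be tested directly from the definition. The following is a minimal sketch under stated assumptions: a feature is taken to be a `(position, orientation)` tuple with a complex position and an angle in radians, a rotation by $\phi$ is assumed to add $\phi$ to the orientation, and the names `transform` and `is_feasible` are hypothetical.

```python
import cmath
import math


def transform(phi, t, feature):
    """Apply T = T_t o T_phi: rotate about the origin by phi, then translate by t."""
    p, theta = feature
    return (cmath.exp(1j * phi) * p + t, theta + phi)


def is_feasible(phi, t, m, d, eps, delta):
    """True if the transformation (phi, t) lies in F_{m,d}."""
    p_m, theta_m = transform(phi, t, m)
    p_d, theta_d = d
    # Wrap the orientation difference into [-pi, pi] before comparing.
    dtheta = math.atan2(math.sin(theta_m - theta_d), math.cos(theta_m - theta_d))
    return abs(p_m - p_d) <= eps and abs(dtheta) <= delta


# Model feature at 1 pointing along x; image feature at 2i pointing along y.
m = (1 + 0j, 0.0)
d = (2j, math.pi / 2)
aligned = is_feasible(math.pi / 2, 1j, m, d, eps=0.1, delta=0.05)
```

Here a quarter-turn followed by a translation by $i$ brings the model feature onto the image feature, so `aligned` is true, while the identity transformation is not feasible for this match.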


1 $TPS = T \otimes R$ is the semi-direct product of the two subgroups.


Figure 3: The model features transformed and plotted with the image features. The image
features have uncertainty circles of radius $\epsilon$ around them. The model and image feature
pairs which fall within $\epsilon$ and $\delta$ of one another form a geometrically consistent match set.
The model and image features alone are shown in figure 2. In this picture, the model
features from figure 2 have been rotated by about 170 degrees counterclockwise, translated
and plotted over the image features.


Figure 4: Left: the model and image features for a particular rotation, $\phi_0$, of the model
feature, shown in position space. The model feature's orbit under rotation is shown as a
dotted circle. Right: the match-disc of feasible translations in translation space for this
rotation of the model feature. Any translation $t = u + iv$ within this disc will leave the
rotated model feature within $\epsilon$ of the image feature. The dotted circle is the orbit of the
match-disc as a function of the rotation of the model feature. Figure 7 shows this same
rotation along with two nearby rotations and the effects on the match-disc.

Figure 5: Left: Position space, showing the model features and the image features
near the correct rotation of the model features; both are plotted together. Right: The
match circles from all possible pairings of the model and image features. The dotted circles
are not valid match-discs for the particular rotation because the differences in orientation
for the feature matches associated with these circles are greater than $\delta$. Note there are
translations simultaneously aligning three matches.


We will use the term match-set to describe a set of model and image feature matches.
Note that a given image or model feature can appear in several different matches in the
same match-set, thus the mapping is not one-to-one. Let $M = \{(m_i, d_j)\}$ be a set of feature
matches. Such a match-set is called geometrically consistent if

$$\bigcap_{(m_i, d_j) \in M} F_{m_i,d_j} \neq \emptyset,$$

that is, if there exists some transformation which is geometrically consistent for all $(m_i, d_j) \in M$.
Intuitively, the overlapping sets $F_{m_i,d_j}$ for all $(m_i, d_j) \in \{m_i\} \times \{d_j\}$ divide $TPS$ into
equivalence classes where each class is associated with a different geometrically consistent
match-set. More formally, let $T \in TPS$ be a transform, and define $\psi(T)$ to be the set
$\{(m_i, d_j) : T \in F_{m_i,d_j}\}$.2 The function $\psi(T)$ partitions $TPS$ forming equivalence classes $E_k$
where $TPS = \bigcup_k E_k$ and $T \equiv T' \Leftrightarrow \psi(T) = \psi(T')$. The set $\{E_k\}$ is the set of all
maximal geometrically consistent match-sets. They are maximal in the sense that for each of
them no other feature match can be added maintaining feasibility. Geometrically, $F_{m,d}$ is
isomorphic to a connected volume in $\Re^2 \times SO_2$ with a cylindrical tube shape. The union of
the boundaries of these volumes divides $\Re^2 \times SO_2$ into cells, where each cell is isomorphic
to an equivalence class $E_k$. Figure 6 illustrates several overlapping sets $F_{m,d}$. Note that
clustering techniques such as the Hough transform seek to find cells in $TPS$ which are
contained within many of the $F_{m_i,d_j}$ by using a crude quantization of $TPS$.
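
The function $\psi(T)$ can be evaluated for any single transformation by testing every model and image pair. The sketch below is illustrative only: `psi` is a hypothetical name, features are assumed to be `(complex position, angle in radians)` tuples, and matches are reported as index pairs `(i, j)`.

```python
import cmath
import math


def psi(phi, t, models, images, eps, delta):
    """Return the set of index pairs (i, j) whose match is feasible under (phi, t)."""
    w = cmath.exp(1j * phi)
    matches = set()
    for i, (p_m, theta_m) in enumerate(models):
        for j, (p_d, theta_d) in enumerate(images):
            dtheta = theta_m + phi - theta_d
            dtheta = math.atan2(math.sin(dtheta), math.cos(dtheta))
            if abs(w * p_m + t - p_d) <= eps and abs(dtheta) <= delta:
                matches.add((i, j))
    return frozenset(matches)


models = [(0j, 0.0), (1 + 0j, 0.0)]
images = [(5 + 5j, 0.0), (6 + 5j, 0.0)]
# Two nearby translations that fall in the same equivalence class.
a = psi(0.0, 5 + 5j, models, images, eps=0.1, delta=0.1)
b = psi(0.0, 5.01 + 5j, models, images, eps=0.1, delta=0.1)
```

Two transformations belong to the same equivalence class exactly when `psi` returns the same set for both, as `a` and `b` do here. Sampling transformations this way only probes the partition; it does not enumerate it.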

The main goal then is to determine all maximal geometrically consistent match-sets.
Each such match-set is associated with a set of geometrically consistent transformations,
and thus can be refined to form a hypothesis on the pose of the object in the image.
We describe a method to enumerate the set $\{\psi(T) \mid T \in TPS\}$ of all maximal geometrically
consistent match-sets. The algorithm is polynomial in the number of model and image
features.

2 To be precise, define $F_0 = TPS - \bigcup_{i,j} F_{m_i,d_j}$. So for $T \in F_0$, $\psi(T) = \emptyset$.

Figure 6: Two views of transformation parameter space. The vertical axis is rotation,
$\phi$, and the other two are translations $u$ and $v$. Shown are a number of geometrically
consistent transformation volumes for nine different feature pairs. Each set of consistent
transformations is a helical tube, with extent in the vertical direction depending on $\delta$, the
angle uncertainty, and circular cross section depending on $\epsilon$, the positional uncertainty.
Slices of this space at fixed rotation $\phi$ are shown in figures 11 and 12.

2.3  Topological  Analysis  of  Transform  Space 

To introduce the feature matching and hypothesis technique we'll first consider the
case of simple point features without any associated orientation. Thus the only relevant
uncertainty bound is the uncertainty in image feature position $\epsilon$. We'll then consider the
case including orientation.

A crucial conceptual and algorithmic approach will be to consider the group of
translations, $T$, separately from the group of rotations $R$. Define $\psi(T_t, T_\phi)$ to be the set
$\{(m_i, d_j) : T = T_t \circ T_\phi \in F_{m_i,d_j}\}$. For a fixed rotation operator $T_\phi$, the function $\psi(T_t, T_\phi)$
partitions the translation group $T$ where, as above, $T_t \equiv T_{t'} \Leftrightarrow \psi(T_t, T_\phi) = \psi(T_{t'}, T_\phi)$.
Let the equivalence classes be $E^t_k$, and define the function $\Psi(T_\phi) = \{E^t_k\}$ to be the set of
equivalence classes of translations $T_t$ for fixed rotation $T_\phi$. The function $\Psi(T_\phi)$ partitions
the set $R$ of rotation operators, where $T_\phi \equiv T_{\phi'} \Leftrightarrow \Psi(T_\phi) = \Psi(T_{\phi'})$. Denote these
equivalence classes by $E^\phi_k$. This partition of $R$ is crucial to the matching approach we develop
here.


2.3.1  Feasible  Translations 

Consider an arbitrary but fixed angle of rotation $\phi_0$, and a single feature match $(m_i, d_j)$.
If the model feature's position is rotated by an angle $\phi_0$, the translation exactly aligning
the two points is given by

$$t_{ij} = p_{d_j} - e^{i\phi_0} p_{m_i}.$$

But we only require a transformation approximately aligning the two features. We call any
translation $T_t$ for which $|p_{d_j} - T[p_{m_i}]| \le \epsilon$, with $T = T_t \circ T_{\phi_0}$, a feasible translation.
Define $C_{ij}(T_\phi)$ to be the set of feasible translations for feature match $(m_i, d_j)$ after the
rotation $T_\phi$ has been applied, thus

$$C_{ij}(T_\phi) = \{ t_{ij} - z : z \in \mathbb{C},\ |z| \le \epsilon,\ t_{ij} = p_{d_j} - T_\phi[p_{m_i}] \}.$$

For each feature match, the set of feasible translations $C_{ij}(T_\phi)$ for any fixed rotation $T_\phi$ is
a disc in translation space; call this a match-disc. In this case, the regions of translation
space formed when two or more discs have a common intersection are the equivalence classes
of translation, $E^t_k$. Said another way, if we consider the fixed rotation $\phi_0$ but consider all
possible matches $\{(m_i, d_j)\} = \{m_i\} \times \{d_j\}$ and the set of associated match-discs $\{C_{ij}(T_{\phi_0})\}$,
the function $\psi(T_t, T_{\phi_0})$ partitions the translation space into equivalence classes $E^t_k$ where
$T_t \equiv T_{t'}$ when for all $(i, j)$, $T_t \in C_{ij}(T_{\phi_0}) \Leftrightarrow T_{t'} \in C_{ij}(T_{\phi_0})$.
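
As a small sketch of the match-disc idea (the helper names are hypothetical; positions and translations are assumed to be Python complex numbers), the disc center and the feasibility test for a translation at a fixed rotation can be written as:

```python
import cmath


def match_disc_center(phi, p_m, p_d):
    """Translation that exactly aligns the rotated model point with the image point."""
    return p_d - cmath.exp(1j * phi) * p_m


def is_feasible_translation(t, phi, p_m, p_d, eps):
    """True if translation t lies in the match-disc for this match at rotation phi."""
    return abs(t - match_disc_center(phi, p_m, p_d)) <= eps


p_m, p_d, phi, eps = 1 + 1j, 3 - 2j, 0.7, 0.5
center = match_disc_center(phi, p_m, p_d)
# Residual between the image point and the rotated, translated model point.
residual = abs(cmath.exp(1j * phi) * p_m + center - p_d)
```

Testing a translation against the disc is equivalent to testing the transformed model point against the image point, because the residual position error equals the distance from `t` to the disc center.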

2.3.2  Topological  Boundaries 

We have been considering a fixed rotation $\phi_0$. We now consider what happens to
geometrically consistent match sets as the rotation $\phi$ varies. We showed that the center
of each match-disc $C_{ij}$ is the point in $T$ given by $t_{ij} = p_{d_j} - e^{i\phi} p_{m_i}$. Thus as $\phi$ varies,
the match-disc follows a circular path of radius $|p_{m_i}|$ centered at the point $p_{d_j}$ in the u-v
plane. See figure 7. As the rotation $\phi$ varies the topology of intersections of match-discs
$C_{ij}(T_\phi)$ in $T$ changes; that is, the set $\Psi(T_\phi) = \{E^t_k\}$ changes. This partitions the rotation
group $R$. This is the key insight of the method. By partitioning $R$, each equivalence
class $E^\phi_k$ is associated with a partition of $T$; together these form the partition of the whole
transformation group $TPS$, yielding the set of geometrically consistent match sets

$$\bigcup_{T} \{\psi(T)\}.$$

Figures 8 and 9 show examples of different intersection topologies as the rotation $\phi$ is varied.
Note that as yet we have not required a match set to be a one-to-one correspondence between
model and image features. We will consider this below.
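
As a brief worked step, using only the definition of the match-disc center, the circular orbit and the dependence of the separation of two centers on the rotation can be written out as follows. The second match $(m_k, d_l)$ is introduced here only for illustration.

```latex
\begin{align*}
t_{ij}(\phi) &= p_{d_j} - e^{i\phi} p_{m_i}, \\
\left| t_{ij}(\phi) - p_{d_j} \right| &= \left| e^{i\phi} p_{m_i} \right| = \left| p_{m_i} \right|, \\
t_{ij}(\phi) - t_{kl}(\phi) &= (p_{d_j} - p_{d_l}) - e^{i\phi} (p_{m_i} - p_{m_k}).
\end{align*}
```

The second line is the statement that each center stays at distance $|p_{m_i}|$ from $p_{d_j}$. The third line shows that the relative position of two discs depends on $\phi$ only through $e^{i\phi}$.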

2.3.3  Determining  Topological  Boundaries 

The function $\Psi(T_\phi) = \{E^t_k\}$ changes value when the topology of feasible match-disc
intersections changes with variations in the rotation angle $\phi$. There are three different
events associated with this change. The first case is when two match-discs are intersecting
and then move such that they are just tangent, and then completely separate. The $\phi$ at
which tangency occurs marks a topological boundary. The second case is when three
match-discs are mutually intersecting, then move such that they still intersect pairwise but have
no common intersection. The $\phi$ at which the three boundary circles intersect at a single
point marks a topological boundary. Finally, the $\phi$'s at which two match-discs exactly
coincide mark a topological boundary. These three cases are illustrated in figures 8, 9 and
10. The fact that these are the only cases we need consider is discussed in the appendix
in section 9. Considering all pairs and all triples of match-discs, we can compute those $\phi$'s
marking topological boundaries in the rotation space, by analyzing all occurrences of the
above three cases. Call these $\phi$'s Type I critical rotations. An example of a type I critical
rotation is shown in figure 11. This figure is a slice of constant $\phi$ of the transformation
space shown in figure 6.

Figure 7: Snapshots of position space (the coordinates in which we represent feature
positions) and translation space (the coordinates in which we represent relative translations)
for three different rotations applied to the model feature. The match-circle in translation
space is dotted when the difference in orientation between the image and model feature is
greater than the uncertainty bound in orientation.

Figure 8: Three snapshots of translation space as match circles follow their circular orbits
showing the case of a type I critical rotation due to a three-way intersection between three
match circles. The solid circles are the match circles, the dotted circles are their orbits as
$\phi$ varies. This illustrates how the topology of intersections changes as $\phi$ varies.

Figure 9: Three snapshots of translation space as match circles follow their circular orbits
showing the case of a type I critical rotation due to a tangency condition between two match
circles. This illustrates how the topology of intersections changes as $\phi$ varies.

When we include angle constraints, that is, we require that $|\theta_{d_j} - T_\phi[\theta_{m_i}]| \le \delta$, we
introduce new topological boundaries. Figure 6 shows two views of $TPS$ including the
helical tubes forming geometrically consistent transformation sets for nine different feature
matches. The $\phi$'s marking the ends of a given tube are also rotations where the topology
of intersections of match-discs changes. Intuitively, a match-disc is present in $T$ only in a
rotation range in which the angle constraints are satisfied. The topology of $T$, and thus
$\Psi(T_\phi)$, changes at the edge of the feasible rotation range for any feature match, where
essentially a match-disc vanishes or appears. So for each feature match, the end points of
the range of feasible rotations also mark topological boundaries. Call these $\phi$'s Type II critical
rotations. An example of a type II critical rotation is shown in figure 12.

Figure 10: Three snapshots of translation space as match circles follow their circular orbits
showing the case of a type I critical rotation due to a coincidence condition between two
match circles.

Figure 11: A sequence of match circles in translation space showing a type I critical
rotation. These are slices of the same space as shown in figure 6. Type I critical rotations
are when, as match circles orbit as functions of $\phi$, their topology of intersection changes.

Figure 12: A sequence of rotations of match circles in translation space showing a type
II critical point. These are slices of the same space as shown in figure 6. The topology
of intersection of match-circles changes when a match-circle appears or disappears at the
extrema of feasible orientation difference between the matched features.
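
For a single feature match, the Type II critical rotations are the two ends of the feasible rotation range. The sketch below assumes that a rotation by $\phi$ adds $\phi$ to the model orientation, so the range is centered at $\theta_d - \theta_m$; the function names are hypothetical and angles are in radians.

```python
import math

TWO_PI = 2 * math.pi


def type2_critical_rotations(theta_m, theta_d, delta):
    """End points (start, end) of the feasible rotation range, each in [0, 2*pi)."""
    center = theta_d - theta_m
    return ((center - delta) % TWO_PI, (center + delta) % TWO_PI)


def rotation_is_feasible(phi, theta_m, theta_d, delta):
    """True if rotating the model by phi leaves its orientation within delta of the image's."""
    diff = theta_m + phi - theta_d
    return abs(math.atan2(math.sin(diff), math.cos(diff))) <= delta


start, end = type2_critical_rotations(0.0, 1.0, 0.1)
```

Between `start` and `end` the match-disc for this match is present in translation space; outside that range it is absent.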

2.3.4  Enumerating  Match  Sets 

The critical rotations partition rotation space forming equivalence classes $\{E^\phi_k\}$
associated with constant topology of translation space. Each class $E^\phi_k = [\phi_{b,k}, \phi_{e,k}]$ is a range of
rotation angles where $\phi_{b,k}$ and $\phi_{e,k}$ are critical rotations. To enumerate the match sets we
use the fact that each class $E^\phi_k$ is associated with the set of translational equivalence classes
$\Psi(T_\phi) = \{E^t_k\}$ for any $T_\phi \in E^\phi_k$. Consider one such equivalence class $[\phi_b, \phi_e]$. Pick any $\phi_m$
where $\phi_b < \phi_m < \phi_e$. At this rotation there is a particular configuration of match-discs in
translation space. We'll call the circle forming the boundary of a match-disc a match circle.
As we stated above, the union of the match circles divides the u-v plane into cells, each of
which forms a translational equivalence class $E^t_k$. At a rotation $\phi_m$ as chosen above, if they
intersect at all, two circles intersect at exactly two points. By computing all these
intersection points we have a set of points such that every cell in the u-v plane has at least two such
points falling in its boundary. This gives us a point in each cell. Because $\phi_b < \phi_m < \phi_e$,
at $\phi_m$ there are no 3-way match-circle intersections, circles just tangent to one another, or
coincident circles, and thus each match circle intersection point falls in the interior of all
the match-circles it intersects except for the two match-circles from which it was computed,
where it lies on the boundaries. The top and bottom frames of figure 11 illustrate generic
configurations, while the middle frame shows a critical rotation delineating two equivalence
classes.

We can enumerate all maximal geometrically consistent match-sets over all
transformations, computing

$$\bigcup_{T_t, T_\phi} \{\psi(T_t, T_\phi)\},$$

as follows. For each equivalence class $E^\phi_i$ choose $\phi_{m,i}$ so that $\phi_{b,i} < \phi_{m,i} < \phi_{e,i}$, apply
the rotation $T_{\phi_{m,i}}$, and compute all match-circle intersection points. For each intersection
point, determine the match-discs it falls within, building up a match set, and collect
such sets for all intersection points and all rotational equivalence classes. Albeit inefficient,
this procedure can be used to enumerate the set of all maximal match sets.
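
The first step of this procedure, choosing one rotation strictly inside each rotational equivalence class, can be sketched as follows. This is an illustration under assumptions: the critical rotations are given as a plain list of angles in radians, the classes are taken to be the arcs between consecutive critical rotations on the circle, and `midpoint_rotations` is a hypothetical name.

```python
import math

TWO_PI = 2 * math.pi


def midpoint_rotations(critical):
    """Return one rotation strictly inside each arc between adjacent critical rotations."""
    # Exact duplicates are merged; nearly equal angles are kept as separate values.
    angles = sorted({a % TWO_PI for a in critical})
    if not angles:
        return [0.0]  # no critical rotations: a single class, any angle will do
    mids = []
    for k, start in enumerate(angles):
        end = angles[(k + 1) % len(angles)]  # wraps around past 2*pi
        width = (end - start) % TWO_PI
        if width == 0.0:  # a single critical angle: the arc is the whole circle
            width = TWO_PI
        mids.append((start + width / 2) % TWO_PI)
    return mids


mids = midpoint_rotations([0.0, math.pi])
```

Each returned angle plays the role of $\phi_m$ for one class; the match sets for that class are then built from the match-circle intersection points at that rotation.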

2.4  Summary  of  the  Idea 

We  have  a  set  of  model  features  and  a  set  of  image  features,  and  the  set  of  all  possible 
pairs  between  them.  We  wish  to  find  the  subsets  of  pairs  for  which  there  exists  some  trans¬ 
formation  on  the  model  features  simultaneously  aligning  them  with  their  matched  image 
feature,  to  within  the  uncertainty  bounds  on  position  and  orientation.  These  we  called 
geometrically  consistent  match  sets.  For  any  given  rotation  applied  to  the  model  features, 
and  for  each  feature  match,  there  is  a  disc  in  translation  space  of  feasible  translations. 
The  intersections  of  these  feasible  match  discs  imply  geometrically  consistent  match  sets. 
We can partition the set of rotation angles $\phi$ into ranges in which the set of geometrically
consistent match-sets does not change as $\phi$ varies within the range given by the partition.
Thus there is a finite set of rotation angles $\phi$ which need be considered. For each such
rotation,  there  is  a  finite  set  of  geometrically  consistent  match-sets  which  we  can  compute 
from  the  individual  feasible  matches. 

3  The  Computation 

The previous section outlined the basic idea of topological analysis of transform parameter
space. This section describes the idea from a computational standpoint. We'll first
describe  the  computational  components  needed,  and  then  describe  an  algorithm  for  feature 
matching. 

3.1  The  Basic  Components 

3.1.1  Partitioning  Rotation  Space 

We  now  give  the  details  of  the  computation  of  critical  rotations  associated  with  change 
in  the  topology  of  feasible  translations  in  translation  space.  There  are  three  configurations 
of  match-discs  we  need  to  consider:  when  two  match-circles  are  just  tangent,  when  three 
match  circles  intersect  at  a  point  yet  have  no  common  intersection  of  their  interiors,  and 
when  two  match-circles  exactly  coincide. 

Two match-circles intersect at only one point when their centers $t_1$ and $t_2$ are exactly
a distance $2\epsilon$ apart. In this case the centers satisfy

$$(t_1 - t_2)(t_1^* - t_2^*) = 4\epsilon^2$$

where $t^*$ denotes the complex conjugate of $t$. Thus we seek the roots of the equation

$$t_1 t_1^* + t_2 t_2^* - (t_1^* t_2 + t_1 t_2^*) - 4\epsilon^2 = 0.$$

Recall that $t = p_d - e^{i\phi} p_m$. Thus the above equation can be viewed as a function of $e^{i\phi}$.
Multiplying the above equation by $e^{i\phi}$ will not change its roots, and inspection shows that
the resulting equation is quadratic in $e^{i\phi}$ and therefore there are no more than two values
of $\phi$ which satisfy it. If no unit magnitude complex number is a root then the two circles
are never tangent for any rotation. Also, any $\phi$ will satisfy it when all the coefficients are zero.
The appendix in section 9 gives the solution to the above equation for $\phi$. There are $mn$
circles so there are at most $2\binom{mn}{2}$ such critical rotations.

Figure 13: When three match circles intersect at a point, their centers fall on a particular
circle of radius $\epsilon$, centered at some point $t_0$.
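
The tangency condition for two match circles can be solved numerically as a quadratic in $w = e^{i\phi}$. The sketch below is not the closed form from the appendix: it writes $t_1 - t_2 = D - wM$ with $D = p_{d_1} - p_{d_2}$ and $M = p_{m_1} - p_{m_2}$ (symbols introduced here for illustration), expands $|D - wM|^2 = 4\epsilon^2$, and multiplies by $w$. The function name is hypothetical and positions are assumed to be Python complex numbers.

```python
import cmath


def tangency_rotations(p_m1, p_d1, p_m2, p_d2, eps, tol=1e-9):
    """Rotations phi at which the two match-disc centers are exactly 2*eps apart."""
    M = complex(p_m1 - p_m2)
    D = complex(p_d1 - p_d2)
    # a*w^2 + b*w + c = 0, obtained from |D - w*M|^2 = 4*eps^2 with |w| = 1.
    a = -M * D.conjugate()
    b = abs(D) ** 2 + abs(M) ** 2 - 4 * eps ** 2
    c = -M.conjugate() * D
    if abs(a) < tol:
        return []  # degenerate: the separation of the centers does not depend on phi
    disc = cmath.sqrt(b * b - 4 * a * c)
    roots = [(-b + disc) / (2 * a), (-b - disc) / (2 * a)]
    # Only roots on the unit circle correspond to actual rotations.
    return [cmath.phase(w) for w in roots if abs(abs(w) - 1) < 1e-6]


phis = tangency_rotations(1 + 0j, 2 + 0j, 0j, 0j, eps=1.0)
```

For this example the two returned angles satisfy $\cos\phi = 1/4$. A pair of matches whose centers are never $2\epsilon$ apart yields an empty list, and a double root (the circles touch at one rotation only) is returned twice.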

Given three match circles whose centers are described by the complex quantities $t_1(\phi)$,
$t_2(\phi)$, and $t_3(\phi)$, we seek those $\phi$ for which they all intersect at a point. Note that this is
equivalent to seeking those pairs $(\phi, t_0)$ for which $t_1$, $t_2$, and $t_3$ fall on a circle of radius $\epsilon$
centered at some point given by $t_0$. See figure 13.

This case is described by the following system, which we must solve for $\phi$ and
$t_0 = u_0 + i v_0$:

$$(t_1(\phi) - t_0)(t_1^*(\phi) - t_0^*) = \epsilon^2$$

$$(t_2(\phi) - t_0)(t_2^*(\phi) - t_0^*) = \epsilon^2$$

$$(t_3(\phi) - t_0)(t_3^*(\phi) - t_0^*) = \epsilon^2$$

As is derived in the appendix in section 9, the values of $\phi$ which satisfy this system satisfy
the equation

$$(t_2 - t_1)(t_3 - t_1)(t_2^* - t_1^*)(t_3^* - t_1^*)(t_3^* - t_2^*)(t_3 - t_2) + \epsilon^2 \left[ (t_2^* - t_1^*)(t_3 - t_2) - (t_2 - t_1)(t_3^* - t_2^*) \right]^2 = 0.$$

Multiplying the above equation by $e^{3i\phi}$ will not change its roots, and inspection shows the
result is a 6th degree polynomial in $e^{i\phi}$. There are at most six distinct solutions to this,
unless any $\phi$ is a solution when all the coefficients are zero. Note that since the above
equation is purely real, the coefficients to powers of $e^{ik\phi}$ and $e^{-ik\phi}$, respectively, are complex
conjugates; thus, the equation is a 3rd degree trigonometric polynomial in $\cos(\phi)$ and $\sin(\phi)$.
There are $\binom{mn}{3}$ triples of circles, so there are at most $6\binom{mn}{3}$ such critical rotations.
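
As a numerical sanity check of the three-circle condition (a sketch only; the appendix derivation is not reproduced here, and `three_circle_residual` is a hypothetical name), the left-hand side of the equation can be evaluated for three given centers. It should vanish when the centers lie on a common circle of radius $\epsilon$.

```python
import cmath


def three_circle_residual(t1, t2, t3, eps):
    """Left-hand side of the three-circle condition for centers t1, t2, t3."""
    def conj(z):
        return complex(z).conjugate()

    product = ((t2 - t1) * (t3 - t1) * conj(t2 - t1)
               * conj(t3 - t1) * conj(t3 - t2) * (t3 - t2))
    bracket = conj(t2 - t1) * (t3 - t2) - (t2 - t1) * conj(t3 - t2)
    return product + eps ** 2 * bracket ** 2


def points_on_circle(center, radius, angles):
    """Helper for the example: points at the given angles on a circle."""
    return [center + radius * cmath.exp(1j * a) for a in angles]


eps = 1.5
on_circle = points_on_circle(0.3 - 0.2j, eps, [0.1, 1.7, 4.0])
off_circle = points_on_circle(0.3 - 0.2j, 2 * eps, [0.1, 1.7, 4.0])
r_on = three_circle_residual(*on_circle, eps)
r_off = three_circle_residual(*off_circle, eps)
```

The residual `r_on` is zero up to rounding, while `r_off` is not. This is consistent with reading the equation as the statement that the circumradius of the triangle of centers equals $\epsilon$.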

The third case where two match-circles exactly coincide is easily computed by equating
$t_1 = t_2$, that is $p_{d_1} - e^{i\phi} p_{m_1} = p_{d_2} - e^{i\phi} p_{m_2}$, and solving for $\phi$. There are at most $\binom{mn}{2}$
such critical rotations. Finally, when we include constraints on relative orientation between
matched features, there is a critical rotation where the difference in orientation of the model
and image features is exactly $\delta$. There are two such rotations for each feature match, thus
exactly $2mn$ such critical rotations.
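
The counts stated in this section can be added up for concrete feature counts. This is simple arithmetic on the bounds as read from the text above, with the coincidence bound taken to be $\binom{mn}{2}$; the function name is hypothetical.

```python
from math import comb


def critical_rotation_bound(m, n):
    """Upper bound on the number of critical rotations for m model and n image features."""
    k = m * n                   # number of match circles
    tangency = 2 * comb(k, 2)   # at most two tangency rotations per pair
    triple = 6 * comb(k, 3)     # at most six three-way rotations per triple
    coincidence = comb(k, 2)    # coincidence of a pair of circles
    orientation = 2 * k         # two Type II rotations per match
    return tangency + triple + coincidence + orientation


bound_3x3 = critical_rotation_bound(3, 3)  # 72 + 504 + 36 + 18 = 630
```

The triple term dominates, which is the source of the cubic growth in $mn$.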


3.1.2  Constructing  Match  Sets 

The critical rotations partition rotation space, and any equivalence class corresponds
to a particular intersection topology of match-discs in translation space. As before, let an
equivalence class be given by $[\phi_b, \phi_e]$ where these end points are adjacent critical rotations,
and choose $\phi_m$ so that $\phi_b < \phi_m < \phi_e$. By the construction of the critical rotations, all the
intersections of match circles at $\phi_m$ will be simply the two intersection points of pairs of
match-circles. Given the locations of the centers of two match circles, $t_1$ and $t_2$, the two
points of their intersection are given by

$$\frac{t_1 + t_2}{2} \pm i (t_1 - t_2) \beta$$

where

$$\beta = \sqrt{\frac{\epsilon^2}{(t_1 - t_2)(t_1^* - t_2^*)} - \frac{1}{4}}.$$

Each such point falls within some number of match-discs, and on the boundary of exactly
two,  those  two  that  generated  the  intersection  point.  Thus  each  circle  intersection  point 
borders  four  different  regions  in  translation  space.  See  figure  14.  To  construct  the  different 
match  sets  associated  with  a  given  intersection  point,  we  search  through  all  the  match-discs 
the  point  falls  within,  and  then  add  either,  neither,  or  both  of  the  generating  match  circles 
to  this  set.  Note  that  in  enumerating  the  match-sets  there  is  some  duplication  when  two 
or  more  intersection  points  border  the  same  cell  in  translation  space. 
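
The construction at a single intersection point can be sketched as below. Assumptions are labeled: match-disc centers are held in a hypothetical dictionary `centers` keyed by match `(i, j)`, both circles have radius `eps`, and the intersection points are computed from the midpoint of the centers plus a perpendicular offset, which is standard circle geometry.

```python
import math


def circle_intersections(t1, t2, eps):
    """Intersection points of two circles of radius eps centered at t1 and t2."""
    d2 = abs(t1 - t2) ** 2
    if d2 == 0 or d2 > 4 * eps ** 2:
        return []  # coincident centers or circles too far apart
    beta = math.sqrt(max(0.0, eps ** 2 / d2 - 0.25))
    mid = (t1 + t2) / 2
    offset = 1j * (t1 - t2) * beta
    return [mid + offset, mid - offset]


def match_sets_at_point(point, pair, centers, eps):
    """The four match sets bordering an intersection point of the two circles in `pair`."""
    a, b = pair
    inside = frozenset(key for key, c in centers.items()
                       if key not in pair and abs(point - c) < eps)
    # Add neither, either one, or both of the generating matches.
    return [inside, inside | {a}, inside | {b}, inside | {a, b}]


centers = {(0, 0): 0j, (1, 1): 1 + 0j, (2, 2): 0.5 + 0.5j}
pts = circle_intersections(centers[(0, 0)], centers[(1, 1)], 1.0)
upper = max(pts, key=lambda z: z.imag)
sets = match_sets_at_point(upper, ((0, 0), (1, 1)), centers, 1.0)
```

In this example the upper intersection point lies inside the third disc, so the largest of the four sets contains all three matches. Collecting the sets in a Python `set` across intersection points would remove the duplication mentioned above.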

We can now show a loose upper bound on the complexity of feature matching in this
case. For each of the $O(m^3n^3)$ critical rotations there are at most $2m^2n^2$ intersection
points. For each circle intersection point we must potentially look for containment in at
most $mn - 2$ match circles in order to construct the match sets associated with it. So a
crude upper bound on the complexity of constructing all maximal geometrically consistent
match sets is $O(m^3n^3) \cdot 2m^2n^2 \cdot (mn - 2) = O(m^6n^6)$. This gives the basic idea; however,
there are more efficient ways to construct match sets, which we will use in the algorithm
development below.

Figure 14: Each intersection point between two match circles borders four different regions
of  translation  space,  and  is  thus  associated  with  four  different  match  sets. 

3.1.3  Evaluating  Match  Sets 

Ultimately  each  match  set  forms  a  hypothesis  on  object  pose  which  needs  to  be  evaluated 
and  verified  in  some  way.  As  an  initial  step  to  this  process  we  must  determine  the  order  in 
which  the  hypotheses  will  be  verified,  in  case  we  don’t  verify  them  all.  We  seek  to  assign 
some  value  to  each  hypothesis.  The  simplest  and  most  intuitive  value  is  the  number  of 
matches  in  a  match  set.  This  value  in  a  sense  accounts  for  how  well  the  hypothesis  explains 
the  observed  image  data.  Note  that  as  yet  we  have  not  required  a  match-set  to  be  one-to- 
one.  It  is  possible  that  different  feature  matches  involving  the  same  model  or  image  features 
will  be  in  the  same  geometrically  consistent  match  set.  In  the  case  of  point  features  it  is 
likely  that  we  require  that  features  be  matched  one-to-one.  Thus  a  straightforward  count 
of  the  cardinality  of  a  match  set  would  be  misleading. 

One  reasonable  approach  to  this  problem  is  to  construct  a  one-to-one  match  set  of 
maximal  cardinality  by  eliminating  some  feature  matches  from  the  initial  match  sets.  This 
is easy to do [9]. Construct one bipartite graph for each different match set. Let one set
of  nodes  represent  all  the  image  features,  and  let  the  other  set  of  nodes  represent  all  the 
model  features.  There  is  an  edge  between  two  nodes  if  the  given  feature  match  is  in  the 
match  set.  The  graph  is  bipartite  because  no  edge  connects  two  model  feature  nodes,  or 
two  image  feature  nodes.  By  finding  a  maximal  bipartite  matching  we  have  a  maximal 
one-to-one matching of matches from the match set. The bipartite matching problem takes
$O(N^{2.5})$ time [10], where $N$ is the number of nodes in the bipartite graph. Note that there may
be  many  one-to-one  match  subsets  of  maximal  cardinality,  and  we  can  only  find  one  with 
the  graph  approach.  However  we  only  really  seek  a  way  to  evaluate  an  hypothesis.  Any 
match  subset  implies  approximately  the  same  range  of  transformations  as  its  containing 
match  set.  The  cardinality  of  a  maximal  one-to-one  match  set  is  the  value  assigned  to  each 
hypothesis. 
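
A minimal sketch of this evaluation step is given below. It uses a simple augmenting-path search rather than the $O(N^{2.5})$ method cited above, so it illustrates the idea without matching that bound. The function name and the representation of a match set as (model feature, image feature) pairs are assumptions made for illustration.

```python
def max_one_to_one(matches):
    """Return one maximum-cardinality one-to-one subset of a match set.

    matches: iterable of (model_feature, image_feature) pairs.
    """
    adjacency = {}
    for model, image in matches:
        adjacency.setdefault(model, []).append(image)
    owner = {}  # image feature -> model feature currently assigned to it

    def try_assign(model, seen):
        # Look for a free image feature, re-routing earlier assignments if needed.
        for image in adjacency[model]:
            if image in seen:
                continue
            seen.add(image)
            if image not in owner or try_assign(owner[image], seen):
                owner[image] = model
                return True
        return False

    for model in adjacency:
        try_assign(model, set())
    return {(model, image) for image, model in owner.items()}
```

The value assigned to a hypothesis is then `len(max_one_to_one(match_set))`.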

3.1.4  Some  Approximations

Thus  far  there  is  considerable  complexity  in  the  method  as  developed.  First,  to  find 
all critical rotations takes time $O(m^3n^3)$ because we may have to consider all triples of
match-discs.  One  reasonable  approximation  we  can  make  is  to  only  consider  pairs  of  match 
discs  instead.  The  idea  is  the  following.  We  are  looking  for  places  where,  say,  N  match 
circles  have  a  non-empty  intersection.  By  considering  pairwise  circle  intersections,  which 
we  can  find  by  computing  type  I  critical  rotations  due  to  tangency  between  two  circles,  we 
can find a range of rotation where each of the $N$ circles intersects the other $N - 1$ circles
pairwise. Although we cannot be sure that the intersection of all $N$ match
discs is non-empty for the rotations in this range, we will be very close to a rotation where it
is. Intuitively this is because the errors, and thus the match-discs' positions, are random
variables  and  it  is  unlikely  that  the  N  discs  will  be  randomly  arranged  so  that  they  all 
intersect  pairwise  yet  there  is  no  region  where  all  or  most  of  them  intersect.  In  section  4 
we  show  that  the  empirical  data  support  this  hypothesis. 

Second,  when  evaluating  match-sets  we  need  not  perform  a  full  bipartite  matching. 
Instead we can form an upper bound on the size of the maximal match subset as follows.
Determine  the  minimum  of  the  number  of  distinct  image  features  and  the  number  of  distinct 
model  features  appearing  in  a  match  set.  This  quantity  is  the  upper  bound  we  seek, 
eliminating  all  duplicate  features.  It  can  be  computed  in  time  linear  in  the  number  of 
matches  in  the  match  set.  See  [9]  for  a  detailed  discussion  of  this  idea. 
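
The bound described above amounts to a few lines of code. This is a sketch with a hypothetical function name, again assuming a match set is a collection of (model feature, image feature) pairs.

```python
def one_to_one_upper_bound(matches):
    """Upper bound on the size of any one-to-one subset of a match set."""
    matches = list(matches)  # allow any iterable; we traverse it twice
    distinct_models = {model for model, _ in matches}
    distinct_images = {image for _, image in matches}
    return min(len(distinct_models), len(distinct_images))
```

The bound is not always tight. For the match set {(0, a), (1, a), (2, b), (2, c)} it gives 3, while no one-to-one subset has more than 2 matches.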


3.2  An  Algorithm  for  Hypothesis  Construction 

This  section  details  an  algorithm  for  constructing  all  maximal  geometrically  consistent 
match-sets. We use the approximation that only pairs of match-discs are considered in
determining critical rotations. In the actual implementation we make the following
observation: rather than choose an intermediate angle $\phi_m$ inside an equivalence class at which
to evaluate all intersecting match circles, we only look at regions of the u-v plane where
change occurs. That is, we look at the regions containing the point of tangency of two
match circles for Type I critical rotations, and we look at all circle intersections contained
within the match circle associated with a Type II critical rotation. The algorithm follows
these  basic  steps: 

•  Form  feature  matches,  and  their  corresponding  match  circles 

•  Extremes  of  valid  rotation  for  each  match  circle  form  critical  rotations 

•  Form  pairs  of  match  circles  which  intersect  at  some  rotation 

•  Points  of  tangency  for  these  pairs  form  critical  rotations 

•  Compute  and  order  these  critical  rotation  angles 

•  Compute the match-sets geometrically consistent at some initial rotation angle, $\phi_0$

•  Step from $\phi_0$ through critical rotation angles in order

•  At  each  new  critical  rotation,  determine  any  change  in  the  match-sets  formed 
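
The tangency step in the list above can be sketched as follows. The sketch assumes each match-circle center is $t(\phi) = p_d - e^{i\phi} p_m$ (image position minus rotated model position) and that all circles share radius `eps`, so two circles are tangent when their centers are exactly $2\epsilon$ apart. The function names are hypothetical, and the orientation constraints are deliberately ignored here.

```python
import cmath
import math

def match_center(p_d, p_m, phi):
    """Center of a match circle in translation space at rotation phi."""
    return p_d - cmath.exp(1j * phi) * p_m

def tangency_rotations(p_d1, p_m1, p_d2, p_m2, eps):
    """Rotations at which two match circles of radius eps are tangent."""
    D = p_d1 - p_d2  # difference of image positions
    M = p_m1 - p_m2  # difference of model positions
    if abs(D) == 0 or abs(M) == 0:
        return []  # center separation does not vary in a way that gives tangency
    # |D - e^{i phi} M|^2 = 4 eps^2  reduces to  cos(phi + arg M - arg D) = c
    c = (abs(D) ** 2 + abs(M) ** 2 - 4 * eps ** 2) / (2 * abs(D) * abs(M))
    if abs(c) > 1:
        return []  # the circles never touch, or always overlap
    base = cmath.phase(D) - cmath.phase(M)
    a = math.acos(c)
    two_pi = 2 * math.pi
    return sorted({(base + a) % two_pi, (base - a) % two_pi})
```

Merging these angles with the limits of each match's valid rotation range and sorting the result gives the ordered list of critical rotations that the remaining steps sweep through.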

3.2.1  Intersecting  Pairs  of  Match  Circles 

There  are  exactly  mn  feature  matches,  each  associated  with  a  match  circle  in  translation 
space whose center is given as a function of rotation angle $\phi$. If we consider the orbit of
a match circle as a function of $\phi$ without considering constraints on valid rotations due
to orientation we have a circular orbit. Only over the range of rotations of width $2\delta$ where
the orientations of the model and image features are within $\delta$ of one another is the feature
match feasible. See figure 7. Thus, most match circles will not intersect at any point in
their orbit, and even fewer within their valid range of rotation angle $\phi$.

To  form  pairs  of  intersecting  match  circles,  the  simplest  approach  is  to  consider  all 
pairs of match-circles and determine if intersection is possible. There are $\binom{mn}{2}$ such pairs,
requiring $O(m^2n^2)$ time. By being slightly more careful we can determine which circles
could  ever  possibly  intersect  without  necessarily  considering  all  pairs  of  circles.  We  can 
bound  the  region  of  the  u-v  plane  swept  out  by  each  match  circle  over  its  range  of  valid 
rotation  by  a  rectangle,  and  determine  which  rectangles  intersect.  Only  match  circles  whose 
bounding  rectangles  intersect  can  intersect  one  another.  See  figure  15.  The  intersections  of 
$N$ isothetic³ rectangles in the plane can be computed in $O(N \lg N) + K$ time [12], where $K$ is
the  number  of  intersecting  rectangles.  From  these  potential  pairs  the  valid  intersecting  pairs 

³ Isothetic in this case means the sides of the rectangles are all aligned parallel to the coordinate axes.

Figure 15: The rectangles bounding the area swept out by match circles over their valid
range  of  possible  rotations.  Inside  each  rectangle  are  the  positions  of  the  match  circle  at 
each  of  its  extremes  in  valid  rotation.  The  position  at  the  lower  rotation  angle  is  shown 
with  a  dot  in  the  center  of  the  circle. 

can be determined. Of course, $K = O(N^2)$ in the worst case, but in practice far fewer rectangles
actually intersect, so this first step can reduce the computation considerably.
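
The pruning step can be sketched as below. The box computation assumes the center of a match circle follows $t(\phi) = p_d - e^{i\phi} p_m$ over a valid range with `phi_lo <= phi_hi`. The overlap test shown is the plain quadratic one, not the $O(N \lg N) + K$ method cited above. All names are hypothetical.

```python
import cmath
import math

def swept_bbox(p_d, p_m, phi_lo, phi_hi, eps):
    """Axis-aligned box (xmin, ymin, xmax, ymax) swept by a match circle."""
    r = abs(p_m)
    # t(phi) = p_d + r * exp(i * (phi + arg(-p_m))): an arc around p_d.
    shift = cmath.phase(-p_m) if r > 0 else 0.0
    a0, a1 = phi_lo + shift, phi_hi + shift
    angles = [a0, a1]
    # Axis-extreme points of the arc lie at multiples of pi/2 inside [a0, a1].
    k = math.ceil(a0 / (math.pi / 2))
    while k * (math.pi / 2) <= a1:
        angles.append(k * (math.pi / 2))
        k += 1
    points = [p_d + r * cmath.exp(1j * a) for a in angles]
    xs = [p.real for p in points]
    ys = [p.imag for p in points]
    return (min(xs) - eps, min(ys) - eps, max(xs) + eps, max(ys) + eps)

def boxes_overlap(a, b):
    """True when two (xmin, ymin, xmax, ymax) boxes share at least one point."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]

def candidate_pairs(boxes):
    """Index pairs whose boxes overlap; only these circles can ever intersect."""
    return [(i, j)
            for i in range(len(boxes))
            for j in range(i + 1, len(boxes))
            if boxes_overlap(boxes[i], boxes[j])]
```

Only the pairs returned by `candidate_pairs` need to be examined for an actual tangency.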

3.2.2  Computing  Maximal  Geometrically  Consistent  Match  Sets 

For  all  match  circles  and  all  intersecting  pairs  of  match  circles,  the  set  of  critical  rotations 
is computed and sorted modulo $2\pi$. We pick an arbitrary rotation angle $\phi_0$ that is interior
to some partition, that is, $\phi_0$ is not in the set of critical rotations. The idea will be to
compute the geometrically consistent match sets at $\phi_0$, and then step through the set of
critical  rotations  noting  any  changes  that  occur  in  the  set  of  geometrically  consistent  match 
sets.  Changes  in  match  sets  can  only  occur  at  critical  rotation  points. 

To start we compute the locations of the match circles at the rotation $\phi_0$, and the points
of intersections of match circles. As outlined in section 2.3.1, the union of the match circles
divides the u-v plane into cells associated with match sets. Each distinct cell has at least two
circle  intersection  points  in  its  boundary.  From  these  intersection  points  we  can  construct 
the  match  sets  by  counting  the  number  of  circles  containing  each  intersection  point.  The 
appendix  in  section  9  describes  an  efficient  algorithm  for  determining  the  set  of  match-sets 
for  this  initial,  static  configuration  of  match  circles. 

We  construct  and  maintain  a  dynamic  data  structure  containing,  for  each  match-circle, 
the set of other match-circles currently intersecting it. This table is updated at each critical
rotation when the match-sets change. For a given circle producing an intersection point,
these  are  the  only  circles  that  need  be  checked  for  containment  of  the  intersection  point  in 
order  to  determine  the  associated  match-set. 

We  next  step  through  the  equivalence  classes  of  rotation  space  noting  new  geometrically 
consistent  match  sets  as  they  appear.  There  are  a  few  subtleties  that  must  be  considered. 
First  consider  two  match-circles  which  intersect  over  some  rotation  range.  Ignoring  the 
angle constraints there are two type I critical rotations associated with these two match-circles.
Each match circle also has a range $[\phi_{b_i}, \phi_{e_i}]$ over which it is valid. The
intersection of these ranges $[\phi_{b_i}, \phi_{e_i}] \cap [\phi_{b_j}, \phi_{e_j}]$ for two different match circles yields the
range of rotations where these circles are both valid. Note that either of the two type
I critical rotations may or may not fall inside the range $[\phi_{b_i}, \phi_{e_i}] \cap [\phi_{b_j}, \phi_{e_j}]$. If a type I
rotation $\phi$ is not in $[\phi_{b_i}, \phi_{e_i}] \cap [\phi_{b_j}, \phi_{e_j}]$ then it is not a critical rotation we use to construct
match sets; however, we do use it to maintain the dynamic list of currently intersecting
circles. Therefore, at each type I critical rotation we add or delete circles from the dynamic
intersection sets as needed, but we also know whether or not any given match circle is within
its range $[\phi_{b_i}, \phi_{e_i}]$ for the particular $\phi$ being considered. A second subtlety is that when a
type  II  critical  rotation  occurs,  several  new  match-sets  are  formed.  All  circle  intersections 
falling  within  the  new  match-circle  appearing  must  be  considered  to  determine  the  new 
match-sets  formed.  Next  we  bring  this  all  together  in  an  algorithm. 

3.2.3  Algorithm  Summary 

Considering all this we have the following procedure: Starting with the initial intersections
noted at $\phi_0$, step through the sorted critical rotations, updating the dynamic circle
intersection lists at each Type I critical rotation. If a Type I critical rotation associated
with circle $i$ and circle $j$ occurs within $[\phi_{b_i}, \phi_{e_i}] \cap [\phi_{b_j}, \phi_{e_j}]$ then compute the new match set
implied by determining which circles contain the single point of intersection of the circles $i$
and $j$ for the critical rotation. Only the circles intersecting either $i$ or $j$ need to be considered.
At each Type II critical rotation associated with a circle $i$, the pairwise intersection
points of all circles intersecting circle $i$ are computed, and for each a new match-set is
constructed containing the match for circle $i$.

Finally,  for  each  of  the  match  sets  constructed  we  can  evaluate  them  to  provide  the 
output  of  the  hypothesis  module.  Each  match  set  was  associated  with  some  critical  rotation, 
and  one  of  the  circle  intersection  points  bounding  the  cell  in  translation  space  associated 
with  the  match  sets  provides  a  feasible  translation.  In  this  way  with  each  match  set  we  can 
associate  a  transformation  hypothesis. 

4  Experiments 

We  have  shown  that  all  maximal  geometrically  consistent  match-sets  can  be  found  by 
analysis  of  transform  parameter  space,  facilitated  by  partitioning  rotation  space  at  critical 
rotations.  We  have  outlined  and  implemented  the  full  algorithm  which  constructs  match 
sets,  as  well  as  an  approximate  algorithm.  The  full  algorithm  computes  all  critical  rotations; 
the approximate algorithm does not consider the triple intersections described above but
instead only considers Type I critical rotations due to a tangency condition between two
match circles in translation space, and all Type II critical rotations.

To  explore  the  effectiveness  of  the  approximate  algorithm  several  empirical  tests  were 
performed.  The  question  we  need  to  answer  is  how  often  will  a  correct  hypothesis  be 
missed  when  the  approximate  algorithm  is  used.  That  is,  given  that  there  are  k  of  the 
model  features  present  in  the  image  data,  how  often  will  one  of  the  hypotheses  include  all 
k  features,  and  what  fraction  of  these  k  features  will  be  found  otherwise. 

4.1  Simulations 

In  order  to  carefully  control  all  aspects  of  the  problem,  simulation  experiments  were 
conducted  in  which  synthetic  model  feature  sets  and  data  feature  sets  were  constructed. 
Here  we’ll  discuss  how  these  data  were  constructed  and  the  experiments  run. 

4.1.1  The  Simulated  Model  and  Image  Features 

Synthetic  model  and  image  features  were  constructed  in  the  following  way.  In  all  the 
experiments  point  features  were  used  consisting  of  a  position  in  the  plane  and  an  associated 
orientation.  For  all  experiments  an  “image”  size  of  256  by  256  was  assumed;  that  is  to  say 
the coordinates of the position of a feature, $(x, y)$, fell in the range $x, y \in [-128, 128]$.
For any given experimental trial a total of $m$ random points were generated with integer
coordinates $(x, y)$. To each of these points an independent unit vector of random orientation
was  assigned.  This  set  of  oriented  feature  points  formed  the  model  feature  set.  Examples 
of  synthetic  model  and  image  features  are  shown  in  figure  2. 

To construct simulated image data, a copy of the set of $m$ model features was constructed.
From this set, $k$ of the $m$ model features were chosen at random and the remaining
features deleted from the set to simulate occlusions and missing features. For each remaining
feature its position was perturbed by adding a random vector of length $l < 0.95\epsilon$, and
its orientation was perturbed by a random angle chosen from $[-0.95\delta, 0.95\delta]$. Next, an
arbitrary  rotation  and  translation  were  applied  to  this  perturbed  set.  Finally,  to  simulate 
spurious  data  features  additional  random  features  were  added  to  complete  the  synthetic 
image  feature  set. 
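
The construction above can be sketched as follows. The text does not state the sampling distributions, so uniform sampling of the perturbation length and direction, of the rigid motion, and of the spurious features is an assumption here, as are all function and parameter names. Angles, including `delta`, are in radians.

```python
import cmath
import math
import random

def random_feature(rng, half=128):
    """Oriented point feature: integer position in the window, random orientation."""
    position = complex(rng.randint(-half, half), rng.randint(-half, half))
    return position, rng.uniform(-math.pi, math.pi)

def make_trial(m, k, n_spurious, eps, delta, rng):
    """Build one synthetic trial: model features, image features, ground truth."""
    model = [random_feature(rng) for _ in range(m)]
    kept = rng.sample(range(m), k)  # the other m - k features are "occluded"
    phi = rng.uniform(-math.pi, math.pi)  # arbitrary rotation
    t = complex(rng.uniform(-128, 128), rng.uniform(-128, 128))  # arbitrary translation
    rot = cmath.exp(1j * phi)
    image, truth = [], []
    for index in kept:
        position, theta = model[index]
        # Perturb position by a vector shorter than 0.95*eps, orientation within 0.95*delta.
        position += cmath.rect(rng.uniform(0, 0.95 * eps), rng.uniform(-math.pi, math.pi))
        theta += rng.uniform(-0.95 * delta, 0.95 * delta)
        image.append((rot * position + t, theta + phi))
        truth.append((index, len(image) - 1))  # (model index, image index)
    image += [random_feature(rng) for _ in range(n_spurious)]  # spurious features
    return model, image, truth, (phi, t)
```

For example, `make_trial(15, 11, 15, 8.0, math.radians(12), random.Random(0))` produces 15 model features and 26 image features, 11 of which correspond to the model.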

4.1.2  The  Experiments 

Because  the  image  and  model  features  were  constructed  synthetically,  it  was  possible 
to  determine  which  matches  were  correct.  Each  experimental  run  determined  whether 
one  of  the  hypothesized  geometrically  consistent  match  sets  contained  all  k  correct  feature 
matches  among  their  matches,  where  k  is  the  number  of  model  features  actually  appearing 
among  the  image  features.  To  demonstrate  the  effectiveness  of  the  complete  version  of  the 
algorithm in which the triple-circle intersection case was considered, the following experiment
was run. $m = 15$ random model features were generated, from which $k = 11$ were
randomly selected, independently perturbed in position and orientation, and added to 15
other random features. This set of $n = 26$ data features was then rotated and translated
by an arbitrary amount. This formed a model set of 15 features, and a data set of 26
features, of which 11 corresponded to the model. This process was repeated in 1000 independent
trials using $\epsilon = 8$ and $\delta = 12$ degrees, generating completely new random model
and data features each trial. When all critical rotations were computed, including triple
intersections, a correct match-set of size 11 was found in all 1000 trials.

Because  the  existence  of  spurious  features  only  adds  to  the  number  and  size  of  match 
sets,  but  does  not  affect  what  fraction  of  the  k  model  features  appear  in  an  hypothesized 
match set, for the following experiments there were no deleted or spurious features. The following
experiments test the degree to which the approximate algorithm actually constructs
all  maximal  geometrically  consistent  match  sets. 

4.1.3  The  Effect  of  Model  Feature  Count 

The first experiment explored the effect of increasing the number $m$ of model features.
For these experiments there were $n = m$ image features. For each value of
$m \in \{3, 6, 9, 12, 15, 18, 21, 24\}$, 10,000 independent, completely random experiments were performed.
That is to say, for each value of $m$, 10,000 different random model feature sets
were  generated,  and  from  each  of  these  the  random  image  features  were  constructed  by 
applying  an  independent  random  perturbation  to  each  model  feature,  and  then  applying  a 
rigid  random  rotation  and  translation  to  this  set,  independent  from  all  other  trials.  The 
results  of  this  experiment  are  summarized  in  two  figures.  Figure  16  shows  how  often  the 
approximate  algorithm  failed  to  find  a  match-set  with  m  correct  feature  matches  in  it,  i.e. 
a correct hypothesis. Shown, for each value of $m$, is the fraction of 10,000 trials where a
match-set  containing  m  correct  matches  was  not  constructed. 

Importantly, out of these 80,000 independent random experiments, a correct hypothesis
was constructed in all but 7287 trials. Of these, in 6970 trials the largest correct match-set
was of size $m - 1$; in 313 trials the largest correct match-set was of size $m - 2$; and in
only 4 trials was the largest correct match-set of size $m - 3$. This means that using the
approximate algorithm, in 99.6% of the 80,000 trials a correct match-set of size $m$ or $m - 1$
was  constructed.  Figure  17  shows  the  average  over  10,000  trials  of  the  fraction  of  the  model 
correctly  matched  in  the  largest  correct  match-set,  for  each  value  of  the  model  size  m. 
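
The percentages quoted above follow directly from the reported counts, as this short check shows:

```latex
\begin{align*}
6970 + 313 + 4 &= 7287
  && \text{(trials with no match-set of size } m\text{)} \\
\frac{7287}{80{,}000} &\approx 0.0911
  && \text{(overall fraction of such trials)} \\
\frac{80{,}000 - (313 + 4)}{80{,}000} = \frac{79{,}683}{80{,}000} &\approx 0.9960
  && \text{(match-set of size } m \text{ or } m - 1\text{)}
\end{align*}
```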

4.1.4  The  Effect  of  Position  Error  and  Uncertainty 

The  second  experiment  explored  the  effect  of  varying  the  amount  of  error  in  position 
which  was  introduced  when  constructing  synthetic  image  features  sets,  and  correspondingly 
varying  the  uncertainty  bound  assumed  in  the  algorithm.  For  this  experiment  a  range  of 
integer position uncertainty bounds $\epsilon \in \{2, 4, 6, 8, 10, 12, 14, 16, 18, 20\}$ was used, and the
error introduced in the synthetic image features was $0.95\epsilon$. For each value of $\epsilon$, 1000 independent
trials were conducted, using 9 model features and 9 synthetic image features.
Figure 18 shows how often the approximate algorithm failed to find a match-set with 9
correct feature matches in it, i.e. a correct hypothesis. Shown, for each value of $\epsilon$, is the
fraction of 1000 trials where a match-set containing 9 correct matches was not constructed.
Out of 1000 trials, there were 851 trials where a correct match-set of size 9 was not constructed,
of which 828 constructed match-sets of size 8, and 23 constructed match-sets of
size 7.

Figure 16: For each value of $m$, the number of model features, shown is the fraction of 10,000
trials where a match-set containing $m$ correct matches was not found, although one or more existed.



4.1.5  The  Effect  of  Orientation  Error  and  Uncertainty 

The  third  experiment  explored  the  effect  of  varying  the  amount  of  error  in  orientation 
which  was  introduced  when  constructing  synthetic  image  features  sets,  and  correspondingly 
varying  the  uncertainty  bound  assumed  in  the  algorithm.  For  this  experiment  a  range  of 
integer orientation uncertainty bounds $\delta \in \{2, 4, 6, 8, 10, 12, 14, 16, 18, 20\}$ (in degrees) was
used, and the error introduced in the synthetic image features was $0.95\delta$. For each value
of $\delta$, 1000 independent trials were conducted, using 9 model features and 9 synthetic image
features. Figure 19 shows how often the approximate algorithm failed to find a match-set
with 9 correct feature matches in it, i.e. a correct hypothesis. Shown, for each value of
$\delta$, is the fraction of 1000 trials where a match-set containing 9 correct matches was not
constructed. Out of 1000 trials, there were 735 trials where a correct match-set of size 9
was  not  constructed,  of  which  720  constructed  match-sets  of  size  8,  and  15  constructed 
match-sets  of  size  7. 

4.2  A  Real  Example 

Finally,  as  an  example  of  a  real  application,  and  of  running  the  algorithm  on  a  much 
more  complicated  case,  features  were  derived  from  real  images.  For  both  the  model  features 
and  the  image  features,  the  features  were  derived  by  starting  with  a  grey-level  image  of  the 

Figure 17: For each value of $m$, the number of model features, shown is the average over 10,000
trials,  of  the  fraction  of  the  model  features  correctly  matched  in  the  match  set  with  the  largest 
number  of  correct  matches. 

object  and  a  scene,  applying  an  edge  detector,  and  sampling  the  resulting  edge  contours 
at  regular  intervals,  forming  features  out  of  the  sample  point  positions  and  the  contour 
normal  at  that  point.  Of  course,  there  are  possibly  much  better  ways  to  derive  point 
features  from  edge  contours,  but  this  method  demonstrates  our  point  very  well.  Figure 
20  shows  the  image  from  which  the  image  data  were  derived.  Figure  22  shows  the  edge 
contours  for  both  the  model  and  the  input  data  image,  and  figure  21  shows  the  model  and 
image point features derived from the edge contours. Figure 20 shows an example of one of
the  hypotheses  constructed.  The  model  contour  is  transformed  according  to  the  hypothesis 
and  plotted  over  the  image  contours.  In  this  case  there  were  41  model  features  and  161 
image  features,  and  the  largest  match-set  had  27  matches.  This  is  the  match  set  we  have 
displayed. This illustrates that we can use the method to derive robust pose hypotheses
from  complex  images. 

4.3  Discussion  of  the  Experiments 

The most important conclusion we can draw from these experiments is that the approximate
algorithm, which only considers the interaction of pairs of match-circles, works nearly
as well as the complete algorithm. Over a broad range of model size $m$, and the uncertainty
parameters $\epsilon$ and $\delta$, all image features were correctly matched to model features, except for
one or two. When the model is reasonably large, say 10 features or more, almost all the
model will be correctly matched.

The  experimental  results  can  be  explained  in  terms  of  the  interactions  of  match-circles. 

Figure 18: For the case of 9 model features and 9 image features, for each value of the position
error  introduced  and  the  uncertainty  assumed,  we  plot  the  fraction  of  1,000  trials  where  a  match-set 
containing  9  correct  matches  was  not  found,  although  one  or  more  existed. 


In the experimental results shown in figure 16, the fraction of cases with suboptimal performance
rises with the model size $m$. This makes sense in retrospect. Imagine the u-v
plane when the correct rotation has been applied. There is a region of intersection of all
$m$ correct match discs. As we vary $\phi$ and some disc moves off of this region of intersection,
the  more  match-circles  involved,  the  more  likely  it  is  to  cross  over  the  intersection  of  two 
other  circles,  rather  than  crossing  only  a  single  circle.  In  the  case  shown  in  figure  18,  the 
larger  the  match-circles,  the  less  likely  many  of  them  will  actually  bound  the  region  of 
maximal  intersection,  rather  than  just  containing  it.  Thus  as  some  circle  moves  off  there 
are  fewer  circle  intersections  for  it  to  cross.  Finally  in  figure  19  we  see  that  the  type  II 
critical  rotations  reduce  the  need  to  look  for  type  I  critical  rotations.  The  larger  the  angle 
uncertainty $\delta$, the more circles interact with type I intersections, and thus the more likely
we  will  need  to  consider  a  three-way  intersection. 

5  Extensions:  Line  Segments  as  Features 

Many  existing  recognition  approaches  use  straight  line  approximations  to  contours.  One 
advantage  of  this  approach  is  a  great  reduction  in  the  number  of  features  to  process.  The 
approach  taken  in  this  paper  is  not  limited  to  point  features,  but  applies  equally  well  to 
features  with  finite  extent  such  as  line  segments.  First  consider  the  case  where  the  model 
is  composed  of  straight  line  segments  and  the  image  data  are  oriented  point  features  as 
before.  This  is  closely  related  to  the  case  where  both  the  model  and  data  are  composed  of 

Figure 19: For the case of 9 model features and 9 image features, for each value of the orientation
error  introduced  and  the  uncertainty  assumed,  we  plot  the  fraction  of  1,000  trials  where  a  match-set 
containing  9  correct  matches  was  not  found,  although  one  or  more  existed. 

line  segments  if  we  subtract  the  length  of  the  image  segment  from  the  model  segment  and 
treat  the  image  segment  as  a  point. 

As  before  we  will  assume  the  uncertainty  in  position  and  orientation  are  independent. 
As  in  the  point  feature  case,  for  a  given  feature  match  only  those  rotations  leaving  the 
orientation of the two features within $\delta$ of one another are feasible, where the orientation
of  a  line  segment  is  given  by  a  unit  perpendicular  vector.  In  the  case  of  position,  consider 
the  vector  from  the  center  of  the  line  segment  to  the  image  point.  See  figure  23.  For 
convenience  we  will  define  the  range  of  feasible  translations  to  be  any  translation  leaving 
the perpendicular component (relative to the segment) of this vector less than $\epsilon$, and the
tangential component less than $\frac{l}{2} + \epsilon$, where $l$ is the length of the model line segment. Thus
the  range  of  feasible  translations,  for  any  particular  rotation,  is  a  rectangular  box. 

Let  pm  represent  the  midpoint  of  a  model  line  segment.  The  center  of  a  match-rectangle 
in  translation  space  is  given  as  before  by 

$$u + iv = t = p_d - e^{i\phi} p_m$$

where $p_d$ is the position of the image point. The match rectangle can be constructed by
centering a rectangle around $t$ whose sides are parallel and perpendicular to the orientation
of the rotated model segment, $e^{i\phi}\theta_m$.
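
A minimal sketch of the resulting feasibility test for one segment-to-point match is given below. It assumes a model point $x$ is mapped to $e^{i\phi}x + t$, that the segment orientation is supplied as the angle `theta_m` of its unit perpendicular vector, and that the tangential bound is half the segment length plus $\epsilon$. The function name and parameter names are hypothetical.

```python
import cmath

def feasible_translation(t, phi, p_m, theta_m, length, p_d, eps):
    """True when translation t (at rotation phi) lies inside the match rectangle."""
    rot = cmath.exp(1j * phi)
    normal = rot * cmath.exp(1j * theta_m)  # unit normal of the rotated segment
    vec = p_d - (rot * p_m + t)  # transformed segment midpoint -> image point
    w = vec * normal.conjugate()
    perpendicular, tangential = w.real, w.imag  # components along normal / segment
    return abs(perpendicular) < eps and abs(tangential) < length / 2 + eps
```

The translation $t = p_d - e^{i\phi}p_m$ always passes this test, since it places the segment midpoint exactly on the image point.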

As  we  vary  the  rotation  applied  to  a  model  segment,  the  rectangle  of  feasible  translations 
rotates around the point $p_d$ exactly as in the case described earlier. If we consider the
translation space rectangles for all feature matches at any fixed rotation $\phi$, the intersection
topology  of  the  rectangles  defines  geometrically  consistent  match  sets  as  before.  Again  the 

Figure 20: A real example. On the left is displayed the leading hypothesis. The model contour
has  been  transformed  and  plotted  over  the  image  contours,  indicating  the  pose  hypothesized.  On 
the  right  is  the  input  image  containing  the  object  of  interest. 


idea  is  to  partition  the  space  of  rotations  into  ranges  within  which  the  intersection  topology 
of translation space is unchanged. As in the case with match-circles, the topology changes
whenever  two  rectangles  begin  to  intersect,  and  when  three  rectangles  intersect  at  a  single 
point,  but  have  no  common  intersection.  See  figure  24. 

Two  rectangles  begin  to  intersect  when  one  of  the  vertices  of  one  lies  on  a  line  segment 
of the other. We’ll parameterize a line by a perpendicular unit vector $n$, and perpendicular
distance $\rho$ from the origin to the line. The orientation $n$ is given by $n = -\sin\theta + i\cos\theta$,
where $\theta$ is the positive angle from the x axis to the line. The equation for a line is then
$x \cdot n = \rho$ in vector notation, or $x n^* + x^* n = 2\rho$ in our complex number representation.
The equation $(x - x_0) n_0^* + (x^* - x_0^*) n_0 = 2\rho_0$ represents a line of the specified orientation
whose perpendicular distance to the point $x_0$ is $\rho_0$. The equation
$(x - x_0) e^{-i\phi} n_0^* + (x^* - x_0^*) e^{i\phi} n_0 = 2\rho_0$ represents the same line rotated about $x_0$ by an angle $\phi$. In this way
we can represent the position of a match-rectangle in translation space by parameterizing its
component line segments.

Two rectangles intersect in the manner of interest when a vertex of one rectangle lies
on a line segment of another. Let the position of a vertex in translation space be given by

$$x = (v - p_{d_j}) e^{i\phi} + p_{d_j},$$

where $p_{d_j}$ is the center of rotation of the match-rectangle in translation space, and $v$ is the
position in the u-v plane of the vertex when $\phi = 0$. Let the infinite line containing a line
segment of a match-rectangle be given as above by

$$(x - p_{d_i}) e^{-i\phi} n_i^* + (x^* - p_{d_i}^*) e^{i\phi} n_i = 2\rho_i.$$

The point lies on the line when it satisfies the equation for the line, so we substitute the first
equation for $x$ in the second equation. Let $v_j = (v - p_{d_j})$. Substituting for $x$ we get

$$(v_j e^{i\phi} + p_{d_j} - p_{d_i}) e^{-i\phi} n_i^* + (v_j^* e^{-i\phi} + p_{d_j}^* - p_{d_i}^*) e^{i\phi} n_i = 2\rho_i.$$

Figure 21: Model features on the left, image features on the right. These features were derived
from real images as shown in figure 20.

Note that all terms occur as complex conjugate pairs, so this equation is a real equation of
the form $\alpha\cos\phi - \beta\sin\phi = \xi$. This can be solved by the equations

$$\phi = \pm\cos^{-1}\left(\frac{\xi}{\sqrt{\alpha^2 + \beta^2}}\right) - \psi$$

where $\psi = \tan^{-1}\frac{\beta}{\alpha}$, taken in the quadrant of $\alpha + i\beta$. Once we have solved for the $\phi$ where the vertex lies in the infinite
line containing the line segment, we must check if it actually falls in the line segment. For
a pair of match-rectangles, this check is done for all pairs of vertices from one rectangle
to line segments of the other. We must also check that at the resulting $\phi$ the interiors of
the two rectangles have no common intersection, otherwise this is not a critical rotation in
rotation space for these two rectangles.
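
As a hedged illustration of this step, the sketch below solves an equation of the form $\alpha\cos\phi - \beta\sin\phi = \xi$ by rewriting the left side as $r\cos(\phi + \psi)$ with $r = \sqrt{\alpha^2 + \beta^2}$. The coefficient extraction assumes the vertex-on-line condition is $\mathrm{Re}[(v_j e^{i\phi} + p_{d_j} - p_{d_i}) e^{-i\phi} n_i^*] = \rho_i$, with $v_j$ the vertex position relative to its center of rotation. The function names are hypothetical, and the segment-membership and interior checks described above are not included.

```python
import math

def vertex_on_line_coeffs(v_j, p_dj, p_di, n_i, rho_i):
    """Coefficients (alpha, beta, xi) of alpha*cos(phi) - beta*sin(phi) = xi.

    v_j: vertex position relative to its center of rotation p_dj (complex).
    p_di, n_i, rho_i: center of rotation, unit normal, and offset of the line.
    """
    c = (p_dj - p_di) * n_i.conjugate()
    alpha = c.real
    beta = -c.imag          # Re(c e^{-i phi}) = Re(c) cos(phi) + Im(c) sin(phi)
    xi = rho_i - (v_j * n_i.conjugate()).real
    return alpha, beta, xi

def solve_rotation(alpha, beta, xi):
    """Return all phi in [0, 2*pi) with alpha*cos(phi) - beta*sin(phi) = xi."""
    r = math.hypot(alpha, beta)
    if r == 0 or abs(xi) > r:
        return []                       # no rotation satisfies the equation
    psi = math.atan2(beta, alpha)       # alpha = r cos(psi), beta = r sin(psi)
    delta = math.acos(xi / r)           # r cos(phi + psi) = xi
    sols = {(s * delta - psi) % (2 * math.pi) for s in (1, -1)}
    return sorted(sols)
```

Using `atan2` rather than a bare arctangent picks the correct quadrant for $\psi$, so both roots are returned regardless of the sign of $\alpha$.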

In the case of a three-way rectangle intersection we solve for the intersection at a common
point of three lines. Let the three equations be, for $i = 1, 2, 3$,

$$(x - p_{d_i}) e^{-i\phi} n_i^* + (x^* - p_{d_i}^*) e^{i\phi} n_i = 2\rho_i.$$

We're solving for some $x$ satisfying these equations for all three lines. Let $y = x e^{-i\phi}$, giving
$y n_i^* - p_{d_i} e^{-i\phi} n_i^* + y^* n_i - p_{d_i}^* e^{i\phi} n_i = 2\rho_i$. The above system is three equations in three
unknowns: the two components of $x$, and $\phi$. Using Gaussian elimination, eliminate $y$ and $y^*$
from the equations. Solve the resulting quadratic for $e^{i\phi}$, substitute back into the original
equations, and solve for $x$. For each triple of match-rectangles, all triples of line segments,
one from each rectangle, must be considered. Finally, each intersection point solved for
must lie inside each line segment, not just on its infinite extension; also, at the resulting $\phi$,
the interiors of the rectangles must have no common intersection.

Figure 22: The edge contours derived from the model image and the input data image.

These two cases serve to find all critical rotations in rotation space where topological
changes occur in translation space. For each such critical rotation point, the intersection
topology of rectangles in translation space must be examined to find the individual cells,
each associated with a different match set. These are found by finding the intersection
points of all the line segments making up the rectangles. The rest of the procedure is
analogous to the case of point features and match-circles.

It  would  be  interesting  to  extend  this  idea  to  uncertainty  cross  sections  of  arbitrary 
convex  polygons  or  curves.  It  may  also  be  possible  to  have  a  multi-tiered  uncertainty 
region,  that  is,  cross  sections  that  consist  of  several  “concentric”  convex  closed  curves, 
possibly  weighted  according  to  their  likelihood.  This  would  be  a  discrete  approximation  to 
a  continuous  uncertainty  probability  function. 

6  Related  Work 

There has been a great deal of work on object identification and localization, particularly
in the domain of 2D objects and sensory data. Some examples include [11][4][2][8][6]. The
work  most  relevant  to  this  paper  is  that  which  relies  on  determining  the  correspondences 
between  local  geometrical  features  derived  from  both  the  model  and  sensory  data.  Among 
these  we  consider  here  those  approaches  which  explicitly  account  for  error  in  the  sensory 
data. 

For the purposes of comparison let's state the important points about our approach.
We utilize local geometric features characterized by a position and an orientation. We
assume that there is uncertainty in the measured geometry of a feature derived from the
sensory data. We further assume this uncertainty can be bounded. By analysis of the
space of transformation parameters, we can construct all maximal geometrically consistent
match-sets in polynomial time⁴. Each match-set forms a globally consistent interpretation
of the sensory data in terms of the model features. Note that without any approximation,
these geometrically consistent match-sets can be found in time polynomial in the number
of model and image features. The asymptotic runtime of the algorithm is independent of
the geometry of any particular set of model or image features.

Figure 23: The position uncertainty for a line segment feature extends a distance $\epsilon$ in the direction
perpendicular to the measured segment, and a distance $l/2 + \epsilon$ from the measured center point in the
direction parallel to the line segment. Above is shown the uncertainty in position space, below is
shown the region of feasible translations in translation space.

Grimson and Lozano-Perez[8] have developed a recognition system for both 2D and
3D objects they call RAF. The features they use are oriented line segments in the 2D
case. They assumed independent bounds on positional and orientational uncertainty. In
RAF the space of feasible sets of model and image feature pairs is explored sequentially,
formulated as a tree search. Each path through the tree from root to leaf forms an element
of the power set of feature pairs, where each node in the tree corresponds to
a particular feature pair. Large sections of the tree can be pruned away by considering,
for each pair of feature matches, the intersection of their sets of geometrically consistent
transformations. If any pair of matches is not a geometrically consistent match set, then
the entire path rooted there can be ignored.

Empirically, the algorithm is quite robust, accurately finding objects in cluttered scenes
given inaccurate sensory data. However, because the algorithm is inherently exponential in
the number of image features, several heuristics are used to make it tractable in practice.
There are two main heuristics employed. One involves grouping feature matches into subsets
via a Hough Transform form of parameter hashing and exploring the tree restricted to
some of these subsets. The other involves an early cutoff of the tree search when a set of
pairwise-consistent matches that is deemed good enough is encountered. The use of these
two heuristics means that the entire space of match-sets is not explored. Finally, when a
pairwise geometrically consistent match-set is constructed, an averaging method is used to
derive a transformation from the set of feature matches. This is then applied to determine if
in fact most feature matches are approximately aligned. Note that this does not determine
if there exists a transformation simultaneously aligning all matches modulo uncertainty, i.e.
that it is a geometrically consistent match-set in our terminology. One difficulty with this
search technique is that it is difficult to determine whether or not there are more instances
of the object to be found after the first few are found. In fact, if there are no instances it
takes considerable search to answer negatively.

⁴ This assumes that solving the sixth-degree polynomials is a constant time operation. However,
as noted earlier, the approximate algorithm is very good, and the required quantities can be computed
analytically.

Figure 24: The events associated with topological change in translation space are two-way and
three-way intersections of rectangles as shown here.

The principal advantages of our work over RAF are that it is worst case polynomial in
the  number  of  features,  that  it  finds  all  maximal  geometrically  consistent  match  sets,  and 
that  these  are  by  construction  globally  consistent.  In  the  full  case  there  is  no  approximation 
involved:  all  feasible  matches  within  the  uncertainty  bounds  are  found.  In  particular,  the 
hypothesis  step  hypothesizes  all  possible  instances  whether  or  not  the  object  is  present.  In 
fact,  if  no  object  is  present  the  computation  is  easier  because  in  this  case  fewer  match-circles 
will  have  common  intersections.  Note  that  one  major  difference  in  the  two  approaches  is  that 
RAF  uses  the  more  practical  line  segment  features,  while  our  system  uses  point  features. 
In  section  5  we  outline  how  the  approach  can  be  generalized  to  line  segment  features,  thus 
making  the  problems  solved  by  the  two  systems  largely  equivalent. 

Ellis[7] considers a special case of the method we discuss here. He assumes a model
composed of line-segment features and data consisting of oriented point features. He assumes
an uncertainty in position of magnitude $\epsilon$, and an independent uncertainty in orientation
of magnitude $\delta$ for the data features. Given a set of corresponding model and image
feature pairs and uncertainty bounds, he shows how to find the range of rotations on the data
features which leaves them within $\epsilon$ of their paired model feature (or really within $\epsilon$ of the
infinite line containing the line segment) and the difference in orientation of paired features
within $\delta$ of one another. Then, given some rotation $\phi$ of the data features, he shows how
to determine the range of translations on the data features allowed within the uncertainty
in position.

The important part of this work is that he defines a model of uncertainty, and shows how
to localize an object with careful attention to the uncertainty. Ellis is considering a different
problem where the correspondences are already known, which in a sense is a special case of
the general matching problem. Using our approach to the analysis of transformation space
we could accomplish the same task of localizing the object to within uncertainty bounds
given the true correspondences. We would simply find the range of rotation $\phi$ within which
all correct match-circles intersect in a common region.

In the field of computational geometry, Alt et al.[1] describe a method for computing
approximate congruence of two point sets of matched cardinality, allowing rigid
transformations of the plane. This means: given two point sets, A and B, of equal cardinality,
find if there exists a one-to-one correspondence between them such that there exists a rigid
transform of the points of set B bringing them within the $\epsilon$-neighborhood of their matched
point from set A. They solve a very similar problem to ours. We might consider A to be
image features, and B to be model features. They don't consider the case where A contains
only a subset of B, as well as additional, unrelated points. They also determine only if at
least one approximate congruence exists, instead of finding all approximate congruences of
all sizes, as we do. They also do not consider angle uncertainty constraints as we do. It
seems, however, that the modifications to their algorithm required to handle these cases
are small. Thus the algorithm they outline could very likely be used to solve our matching
problem in the case where angle constraints are not used. Their approach differs from
ours in that they analyze the image space, instead of the transformation space. There are
two possible advantages of our approach over theirs. First, it is clear how to extend our
approach to extended features and polygonal uncertainty regions. Second, our method can
be extended to higher dimensional transformations, such as including scale, while it is not
clear how to extend their approach in these cases.

Baird[3]  describes  a  method  of  matching  features  under  uncertainty  based  upon  linear 
programming.  He  assumes  bounds  on  position  uncertainty  of  image  features,  and  outlines 
an  expected  polynomial  time  algorithm  for  matching.  An  important  limitation  of  this 
work  is  that  the  size  of  the  model  and  image  feature  sets  is  the  same,  that  is,  he  does  not 
allow  missing  or  spurious  features.  He  indicates  that  considering  these  cases  substantially 
increases  the  complexity  of  the  matching  process. 

The technique of transformation sampling[6] was the developmental ancestor to the
analytic  approach  developed  here.  Rather  than  determine  analytically  all  different  sets  of 
transformations,  the  space  of  transformations  was  sampled  in  hopes  that  a  sample  point 
would  fall  in  each  distinct  feasible  region  of  transformation  space.  The  approach  described 
here is also introduced briefly in [5].

The Hough transform clustering techniques[13] are really a crude approximation to our
analytic  approach.  Again  the  idea  is  to  determine  where  sets  of  feasible  transformations 
intersect  in  transform  space.  But  a  coarse  quantization  is  utilized  to  find  such  intersections, 
leading  to  considerable  inaccuracy. 

7 Conclusions

In  summary  we  have  shown  a  feature  matching  and  pose  hypothesis  technique  that 
requires  time  polynomial  in  the  number  of  features,  and  is  provably  correct  and  complete 
for  a  certain  class  of  local  geometric  features  and  precise  models  of  geometric  uncertainty. 

We  defined  a  maximal  geometrically  consistent  match  set  as  a  set  of  model  and  image 
feature  pairs  for  which  there  is  some  transformation  which  aligns  the  features  in  each  match 
to  within  the  bounds  on  geometric  uncertainty,  and  such  that  at  that  transformation  it  is 
not  possible  to  add  more  feature  matches  to  the  set  and  maintain  feasibility.  We  showed  that 
it  is  possible  to  compute  all  such  geometrically  consistent  match-sets  in  time  polynomial  in 
the  number  of  features. 

One  important  implication  is  that  a  complete  and  precise  solution  to  the  problem  of 
feature  matching  in  the  presence  of  uncertainty  is  of  polynomial  complexity.  If  the  feature 
matching were exact, i.e., if there were no error, then there are simple polynomial algorithms
to  construct  geometrically  consistent  match  sets.  However  if  only  approximate  matching 
is  required  then  these  simple  algorithms  cannot  be  proven  to  find  all  possible  match-sets. 
Existing  approaches  which  explicitly  deal  with  geometric  uncertainty  as  carefully  as  we  do 
have  been  of  worst  case  exponential  complexity.  Thus  our  analysis  provides  an  important 
theoretical  understanding  of  the  matching  problem. 

This  approach  to  matching  is  intended  as  a  crucial  part  of  the  hypothesis  stage  of  a 
recognition  system.  We  have  not  addressed  the  fact  that  given  a  geometrically  consistent 
match  set,  the  actual  transformation  implied  can  be  refined  by  optimizing  some  difference 
metric  over  the  set  of  matches.  Once  high  quality  hypotheses  are  generated,  the  final  step 
is  to  verify  them  using  possibly  richer  representations. 

While  the  2D  matching  case  described  here  is  interesting,  of  greater  interest  is  gaining 
a  more  thorough  understanding  of  the  matching  problem  with  geometric  uncertainty  in 
higher  dimensional  problems  such  as  matching  3D  models  to  2D  image  data,  or  3D  models 
to  3D  range  data.  Our  approach  has  important  implications  in  these  cases.  Our  analysis  of 
transform  space  events  goes  beyond  this  simple  case  and  can  also  be  applied  to  these  higher 
dimensional problems. An approach based on the ideas in this paper has been developed
for the 2D case of rotation and translation including scale variation, and work is in progress
considering matching 3D models to 2D and 3D data.

9 Appendix

9.1  Solving  three-way  intersections 

Given three match circles whose centers are described by the complex quantities $t_1(\phi)$,
$t_2(\phi)$, and $t_3(\phi)$, we seek those $\phi$ for which they all intersect at a point. Note that this is
equivalent to seeking those pairs $(\phi, t_0)$ for which $t_1$, $t_2$, and $t_3$ fall on a circle of radius $\epsilon$
centered at some point given by $t_0$. See figure 13. This case is described by the following
system, which we must solve for $\phi$ and $t_0 = u_0 + iv_0$:

$$(t_1(\phi) - t_0)(t_1^*(\phi) - t_0^*) = \epsilon^2$$
$$(t_2(\phi) - t_0)(t_2^*(\phi) - t_0^*) = \epsilon^2$$
$$(t_3(\phi) - t_0)(t_3^*(\phi) - t_0^*) = \epsilon^2$$

By expanding then subtracting any two of the above equations we get

$$t_i t_i^* - t_j t_j^* - [t_0(t_i^* - t_j^*) + t_0^*(t_i - t_j)] = 0$$


From this we construct the linear system:

$$\begin{pmatrix} (t_2^* - t_1^*) & (t_2 - t_1) \\ (t_3^* - t_2^*) & (t_3 - t_2) \end{pmatrix} \begin{pmatrix} t_0 \\ t_0^* \end{pmatrix} = \begin{pmatrix} t_2 t_2^* - t_1 t_1^* \\ t_3 t_3^* - t_2 t_2^* \end{pmatrix}$$

The determinant of the above matrix is $(t_2^* - t_1^*)(t_3 - t_2) - (t_3^* - t_2^*)(t_2 - t_1)$, which is
non-zero when $t_1$, $t_2$, and $t_3$ are not collinear; thus the solution of this system is the unique
circle of some radius on which $t_1$, $t_2$, and $t_3$ fall. The case where the determinant is zero
results when the three points lie on a circle of infinite radius. We can solve this for $t_0$ using
Cramer's rule:


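$$t_0 = \frac{\begin{vmatrix} t_2 t_2^* - t_1 t_1^* & (t_2 - t_1) \\ t_3 t_3^* - t_2 t_2^* & (t_3 - t_2) \end{vmatrix}}{\begin{vmatrix} (t_2^* - t_1^*) & (t_2 - t_1) \\ (t_3^* - t_2^*) & (t_3 - t_2) \end{vmatrix}}$$

which yields

$$t_0 = \frac{(t_2 t_2^* - t_1 t_1^*)(t_3 - t_2) - (t_3 t_3^* - t_2 t_2^*)(t_2 - t_1)}{(t_2^* - t_1^*)(t_3 - t_2) - (t_3^* - t_2^*)(t_2 - t_1)}$$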


Note that the solution for $t_0^*$ is consistent with the solution for $t_0$. We seek $\phi$ and $t_0$ such
that this circle has radius $\epsilon$. So, substitute $t_0$ into

$$(t_1(\phi) - t_0)(t_1^*(\phi) - t_0^*) = \epsilon^2$$

using

$$(t_1 - t_0) = \frac{(t_2 - t_1)(t_3 - t_1)(t_3^* - t_2^*)}{(t_2^* - t_1^*)(t_3 - t_2) - (t_3^* - t_2^*)(t_2 - t_1)}$$

and finally

$$(t_2 - t_1)(t_3 - t_2)(t_3 - t_1)(t_2^* - t_1^*)(t_3^* - t_2^*)(t_3^* - t_1^*) + \epsilon^2[(t_2^* - t_1^*)(t_3 - t_2) - (t_3^* - t_2^*)(t_2 - t_1)]^2 = 0$$
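
The algebra of this subsection can be checked numerically with a short sketch. It assumes the three centers are given as Python complex numbers at one fixed rotation; the function names are hypothetical, and finding the rotations at which the residual vanishes (the polynomial root-finding step) is not attempted here.

```python
def _det(t1, t2, t3):
    """Determinant of the 2x2 linear system for (t0, t0*)."""
    return ((t2 - t1).conjugate() * (t3 - t2)
            - (t3 - t2).conjugate() * (t2 - t1))

def circumcenter(t1, t2, t3):
    """Center t0 of the circle through three complex points, by Cramer's rule."""
    det = _det(t1, t2, t3)
    if det == 0:
        raise ValueError("points are collinear")
    num = ((t2 * t2.conjugate() - t1 * t1.conjugate()) * (t3 - t2)
           - (t3 * t3.conjugate() - t2 * t2.conjugate()) * (t2 - t1))
    return num / det

def three_way_residual(t1, t2, t3, eps):
    """Left-hand side of the final equation; zero when the circumradius is eps."""
    sides = (t2 - t1) * (t3 - t2) * (t3 - t1)
    prod = sides * sides.conjugate()      # product of the six factors (real)
    det = _det(t1, t2, t3)                # purely imaginary, so det**2 <= 0
    return (prod + eps ** 2 * det ** 2).real
```

For the points 0, 1, and i the circumcenter is (1 + i)/2 and the circumradius is $\sqrt{1/2}$, so the residual vanishes for that value of $\epsilon$ and is non-zero otherwise.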

9.2 Topological Changes in Translation Space

We claimed there were only three events which characterize the way in which the
topology of intersection of match circles changes as circles follow their circular orbits as functions
of the rotation, $\phi$, applied to the model features. The reason for this is fairly simple. Each
distinct cell in translation space is bounded by one or more circles, and their intersection
points. To change the topology, one cell must move into or out of another. To do so their
boundaries must cross. When this happens either two, three, or more circles intersect at
a single point. But if more than three intersect at a single point, any subset of three will
do to solve the equations. There are also limiting cases where the cells of interest are just
points, but these are handled as well. The case where two circles are coincident is the other
special case which must be considered for topological effect.


9.3  Complex  Numbers  and  2D  vectors 


We find it convenient mathematically to consider points in the plane as complex
numbers. With this representation, rotation about the origin by an angle $\phi$ corresponds to
multiplication by the complex exponential $e^{i\phi}$. If $\vec{a} = (a_x, a_y)^T$ and $\vec{b} = (b_x, b_y)^T$, as
complex numbers we have

$$\vec{a} \leftrightarrow a = a_x + i a_y$$
$$\vec{b} \leftrightarrow b = b_x + i b_y.$$

We also have

$$\vec{a} \cdot \vec{b} = \frac{ab^* + a^*b}{2}$$

$$\vec{a} \times \vec{b} = \frac{i(ab^* - a^*b)}{2}$$

where $x^*$ denotes the complex conjugate of $x$.
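
These identities are easy to confirm with Python's built-in complex type. The sketch below is only an illustration; the helper names `dot`, `cross`, and `rotate` are hypothetical.

```python
import cmath

def dot(a, b):
    """2D dot product of vectors stored as complex numbers."""
    return ((a * b.conjugate() + a.conjugate() * b) / 2).real

def cross(a, b):
    """Scalar 2D cross product a_x*b_y - a_y*b_x."""
    return (1j * (a * b.conjugate() - a.conjugate() * b) / 2).real

def rotate(a, phi):
    """Rotate a about the origin by the angle phi (radians)."""
    return cmath.exp(1j * phi) * a
```

For a = 1 + 2i and b = 3 + 4i, `dot` gives 1·3 + 2·4 = 11 and `cross` gives 1·4 − 2·3 = −2, matching the component formulas.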


9.4  Constructing  The  Initial  Match-Sets 

A first step in the algorithm is to determine the intersection topology of the
match-circles for some initial rotation $\phi_0$, and construct the associated match-sets. This can
be done as follows. Compute the intersection points of all the circles, by first using an
algorithm to intersect their bounding squares in $O(N \lg N) + K$ time for N circles with K
intersection points. For each circle we sort its intersection points with other circles by angle.
Because the circles are the same size, if the interiors of two circles intersect, then the circles
must intersect (excluding the special case of concentricity). Thus if we step through the
intersection points in order of angle keeping track of when we enter or leave other circles,
in two passes around each circle we can construct the set of match-sets associated with its
intersection points.
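
For comparison, the following is a deliberately simpler brute-force sketch, not the bounding-square and angular-sweep procedure described above. It tests every pair of circles and then asks which circles contain each intersection point, which costs more time but yields the same maximal match-sets for equal-radius circles. Circle centers are assumed to be complex numbers, match-sets are reported as sets of circle indices, and the names and the tolerance `tol` are hypothetical.

```python
import math
from itertools import combinations

def circle_intersections(c1, c2, eps):
    """Intersection points of two circles of radius eps centered at c1 and c2."""
    d = abs(c2 - c1)
    if d == 0 or d > 2 * eps:
        return []                        # concentric or too far apart
    mid = (c1 + c2) / 2
    h = math.sqrt(max(0.0, eps ** 2 - (d / 2) ** 2))
    perp = 1j * (c2 - c1) / d            # unit vector along the common chord
    return [mid + h * perp, mid - h * perp]

def initial_match_sets(centers, eps, tol=1e-9):
    """Maximal sets of circle indices whose closed discs share a common point."""
    candidates = {frozenset([k]) for k in range(len(centers))}
    for i, j in combinations(range(len(centers)), 2):
        for x in circle_intersections(centers[i], centers[j], eps):
            inside = frozenset(k for k, c in enumerate(centers)
                               if abs(x - c) <= eps + tol)
            candidates.add(inside)
    # Keep only sets that are not strictly contained in another candidate.
    return [s for s in candidates
            if not any(s < other for other in candidates)]
```

The brute-force version relies on the fact that any common region of two or more distinct equal-radius discs has a vertex at a pairwise boundary intersection, so inspecting those points is enough to recover every maximal set.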